Optimal estimates of free energies from multi-state nonequilibrium work data 
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We derive the optimal estimates of the free energies of an arbitrary number of thermodynamic 
states from nonequilibrium work measurements; the work data are collected from forward and 
reverse switching processes and obey a fluctuation theorem. The maximum likelihood formulation 
properly reweights all pathways contributing to a free energy difference, and is directly applicable 
to simulations and experiments. We demonstrate dramatic gains in efficiency by combining the 
analysis with parallel tempering simulations for alchemical mutations of model amino acids. 



Introduction. — Free energy simulations play an ever 
increasing role in studies of condensed matter, partic- 
ularly those concerned with problems in molecular bio- 
physics. With the advent of accurate force fields, in- 
creases in computer power, and continuous methodolog- 
ical advances, "free energy simulations [have] come of 
age." 1] Due to the flexibility of the design and control 
of the simulations at the atomistic level, they can provide 
information not available from experiment. In parallel 
developments, a new generation of single molecule stud- 
ies 0, 13 are using advances in theory 4, J,, ^, i ^] to 
extract free energy information from the experiments. A 
recent example is a single molecule pulling experiment 
that determined the folding free energies of RNA strands. 
The measured unfolding and refolding non-equilibrium 
work data were analyzed with a generalization of Ben- 
nett's acceptance ratio method to finite switching 7]. 
The Bennett method and its generalization were shown 
recently to provide the maximum likelihood estimator of 
the free energy difference, given a set of work data be- 
tween two states 0. Surprisingly, the Bennett method, 
which dates from 1976, has rarely been used in computa- 
tions of free energy differences in biological systems; cal- 
culations have been based primarily on the exponential 
difference formula of Zwanzig |lOl | or the thermodynamic 
integration approach of Kirkwood ■ 

Maximum likelihood estimators, under very general 
and verifiable conditionSjare asymptotically consistent 
and efficient estimators [l^; i-e. they provide the small- 
est variance of any unbiased estimate of the parameters 
underlying the distribution of a large set of sampled data. 
In this letter we provide the maximum likelihood estima- 
tor of free energy differences from nonequilibrium work 
data sets for multiple states and verify that it is asymp- 
totically unbiased and efficient. We also show that the 
method can be used to combine all of the sampled data 
from parallel tempering simulations [l^ Il4l | in an opti- 
mum way to obtain free energy differences. The analysis 
is directly applicable to present-day simulations of free 
energy differences of interest in molecular biophysics; e.g. 
the properties of mutants or the study of multi-ligand 
binding. In what follows we first derive the method and 



then demonstrate its utility in the analysis of parallel 
tempering simulations of alchemical mutations of model 
amino acids. 

Derivation. — Consider a set of N thermodynamic 
states, which correspond to systems with different Hamil- 
tonian, possibly sampled at different external conditions 
(e.g. variation of the temperature). A switch from one 
state to another can be produced either by an instanta- 
neous (sudden) change of the Hamiltonian and the ex- 
ternal conditions with the two systems in the same mi- 
crostate 4, 10|, or a sequence of gradual changes that 
lead from the Hamiltonian and external conditions of the 
initial state to the final state j^, 0, ^3 ■ 

Consider the pair of thermodynamic states i and j and 
a forward and reverse switch, such that a fluctuation the- 
orem 0, 0, ^1 of the following form holds for the prob- 
ability distributions of a path-dependent quantity Wij 
that is odd under path reversal [Wji = —Wij): 

\i ^ jOe^'^- = piW,, \j ^ i)e-4- , (1) 

with p{Wij \i j) the probability of measuring Wij along 
a path sampled in the switch from i to j, and Aij an ap- 
propriate state-dependent quantity; Wij can be thought 
of as a generalization of the work and Aij as the free en- 
ergy difference. In the instantaneous switch case, Wij, 
and Aij are: 

A,j = InZ, -InZj, (2) 

with X a microstate (coordinates and momenta) sampled 
at equilibrium in the initial ensemble, Ej the energy of 
state j, (3j the inverse temperature of bath j, InZj the 
logarithm of the partition function of state j; with these 
definitions Eq.^holds, as shown in Ref 0. In the gradual 
switch case performed in contact with a constant tem- 
perature heat bath, Wij{x) = PW, with W the work 
along the pathway, and Aij = (3/S.Fij, with Ai^^j the 
Helmholtz free energy difference between states i and j. 
More general situations for which Eq. holds are dis- 
cussed in Ref. 6, 15). 

We follow the analysis of Shirts et al (see also 
Ref. ^^) to obtain the conditional probability that a 
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work value Wy- along a path between states i and j re- Inp: 
suited from a sampling of a forward {i —>■ j) switching 
process. From Bayes theorem the ratio of proba- 
bilities of the forward to the backward directions of the 
switch given the work value and the end states i, j is: 



p{i^j\W,j) ^ p{W^j\i j)p{i ^ j) 
Pij^im,) piW,,\j ^ i)pij ^ i)' 



(3) 



with p(i j) the probability that the path between 
states i and j was sampled in the direction from i to j; 
for notational simplicity we omit the explicit dependence 
of all probabilities on the given pair of thermodynamic 
states and on Aij. A given work value between 

the states i and j can be sampled in either the direc- 
tion i j or the direction j — » i, so that the numerator 
and denominator of the left hand side of Eq. Osum to 1. 
Following Ref. 9l, Eq. Eland Eq.[T] can be used to obtain: 



Mj-i = Ini 



— In: 
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with / the Fermi function f{x) = 1/(1-1- e^), and n'°* 
the total number of uncorrelated work data in the direc- 
tion i —>■ j. For the purpose of the maximum likelihood 
estimate of In Zi, the ratio p{i — > j)/p{j — > i)can be sub- 
stituted with nlf/n]f without loss of rigor ji,'l6'|. Eq.H 
resembles Bennett's acceptance ratio of a switch move 
in a simultaneous Monte Carlo sampling of i and j that 
minimizes the variance of their free energy difference. 

We can now write the joint likelihood p of observing 
forward switches from all states i to every other state j 
given the work data between these states as: 



P 



(6) 



with Wij^riij the work values sampled along the n.^ paths 
for the switches i ^ j. This equation, which follows 
from Eq. 0| for independent work data, also gives the 
probabilities of partitioning the work data for all pairs of 
states into those resulting from forward and from reverse 
switches, since the reverse process with i > j corresponds 
to a forward process with j > i. 

We now determine the set of In Zi that maximizes the 
probability in Eq. El or, equivalently the logarithm of 
this probability. This is the central result of the present 
development. As can be seen from Eq. [21 Eq. |S1 remains 
invariant when we multiply every partition function by 
the same constant. We can thus fix one of the InZ^ (e.g. 
set In Zi to zero) and maximize the logarithm of Eq. 
with respect to the remaining ones. Since the derivatives 
of the objective function are available in closed form to all 
orders, we can use the Newton- Raphson method to search 
for a stationary point. Using the properties of the Fermi 
function: dlnf{x)/dx = -fi~x) and f'{x) = -1/(2 + 
2 cosh x), we obtain the first and second derivatives of 
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dlnZa ' 
d^lnp 
d\n Zad\n Zb 
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2 -f 2 cosh (W^y, - ~ M,j) 



with a, h the indexes of the states with respect to which 
we obtain the derivatives and 5i^a the Kronecker delta 
symbol. The full set of components of the forces, Fa 
(Eq. Ej), and of the Hessian, Hab (Eq. (HI), can be effi- 
ciently calculated in a single scan through all pairs of 
states due to the sparseness of the array {<7° }. Fur- 
thermore, the Hessian of Eq. O is always negative def- 
inite; i.e. direct substitution of g"-, q^^ in Eq. O gives 
that for an arbitrary nonzero vector y of TZ^ the product 



y^Hy ^ J2 tij iVi 
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This result is strictly smaller 



than zero if at least one of the {yi — yj ) is different from 
zero. The latter is always the case if we fix one of the 
components of y to zero, which is possible given the in- 
variance of Eq. jSl to rescaling of all the partition func- 
tions. This means that the logarithm of Eq. (Hlonly has a 
single stationary point, which is a maximum. The maxi- 
mum of the probability is obtained efficiently by iteration 
of the Newton-Raphson method: 

(lnZ,)„^, = (ln^a)„-7EKV)„(^^'')„- (9) 

b 

with (InZa)^ the values of the partition functions at it- 
eration n and 7 a scaling factor that limits the maximum 
step size and increases the radius of convergence. 

The limit of the sequence of Eq. gives the maxi- 
mum likelihood estimate of the logarithms of the parti- 
tion functions of all the systems and thus the maximum 
likelihood estimate of their free energies. The maximum 
likelihood estimate is asymptotically unbiased with the 
constraint that the free energies of the system cannot be- 
come infinite [13 (which is always the case in practice). 
Furthermore, the current estimator asymptotically has 
the minimum variance since the third order derivatives 
of the log likelihood are finite for any values of the work 
or the parameters Thus, for the limit of a large data 
set the present analysis gives the optimal asymptotically 
unbiased estimate of the free energies. 

For iV = 2 systems the equation Fa ~ formally equals 
that of the Bennett's acceptance ratio method and thus, 
as Bennett showed Q it converges asymptotically to the 
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FIG. 1: Sampled thermodynamic states for capped amino 
acids valine {V), threonine (T), and asparagine (A''). The 
amino acids are modeled with the CHARMM energy func- 
tion [13. using a parametric potential: 1/(Av,At,An) ~ 

V,T,N V,T,N 

V'' + Vi^+ E V]^jr+ E Ax(Vl'f + V^i^),withAv, At, 

X X 

and An the parameters that scale the non-bonded and electro- 
static energies of the amino acid of interest, V'' the bonded 
energy terms, T^ee t^i^ non-bonded energy terms of the en- 
vironment E (backbone), Vxj^ and Vxx the van der Waals 
and electrostatic energy terms within the side chain of amino 
acid X (from the set {V,T,N}), and Vx% the non-bonded 
energy terms of side chain X with the environment. We im- 
plement the potential with the BLOCK facility |20ll of the 
CHARMM program 13. The bold side chains have Ax = 1, 
the dashed ones have Xx = 0; for example the state labeled 
T has (Av, At, An) = (0, 1, 0). The lower part of the picture 
shows several thermodynamic states from the parallel tem- 
pering simulation. The arrows show a small subset of all the 
possible switching pathways that contribute to the evaluation 
of the 300 K free energy differences. 

free energy perturbation method T^l (or the Jarzynski 
equality |5|) in the limit of equilibrium sampling from 
only one system. 

Example. — We illustrate the method by applying 
it to the analysis of three parallel tempering simula- 
tions |l3l corresponding to alchemical mutations in 
vacuum between pairs of the capped amino acids ll^ va- 
line (V), threonine (T), and asparagine (N) as shown in 
Fig. ^ capped amino acids are widely used models in 
biophysical studies. Here, we determine the room tem- 
perature free energy differences AFv^t, and AFy^N 
and their variances as function of increasing the num- 
ber of pathways to demonstrate the power of the current 
approach. 

We sample the canonical ensembles of each amino 
acid using a molecular-dynamics-based parallel temper- 
ing algorithm 0, 0| that readily equilibrates each 
state despite the high rotational barriers of the x\ an- 
gles ll^. Parallel tempering is performed with 12 
heat baths Tq =255 K, Ti =300 K, = 300 (n * 50) 
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FIG. 2: Scatterplots of the 300 K free energy differences for 
the V— >T and V^N transitions (see Fig.0. Each scatterplot 
is build from 10,000 estimates of the free energy differences 
using random subsets of the work data. These subsets have 
length A'p for all possible pairs of the 3 x A't states of the 
three dipeptides and A^t heat baths. The A'^t = 1 set contains 
only the 300K ensembles of V, T, and N; the sets with more 
than one temperature contain the ensembles of the A^t lowest 
temperature heat baths. The symbol (X) marks the centroid 
of the set with the least scatter. 

K, n G {1, 2, . . . , 10}; the temperatures are controlled via 
Langevin dynamics with a friction coefficient of 40 ps~^. 
The integration time step is 2 fs; SHAKE constraints |2^ 
are applied for bonds involving hydrogen atoms. Every 
10 ps we attempt to swap either all the baths (T2i, ^21+1), 
or all the baths {T2i-\^T2i)^ with i integer. The length 
of each simulation is 360 ns (36000 swap attempts). We 
save snapshots every 2 ps and obtain equilibrated ensem- 
bles of 160,000 snapshots for each heat bath taken from 
the last 320 ns of the simulation. 

Fig.Elshows several scatterplots of the simultaneous es- 
timates of the 300 K free energy differences from equilib- 
rium samples at A't temperatures, using A^p data points 
for each transition. The centroid of the A^t^IS, A^p = 
5000 set is close to the centroid of each scatterplot, which 
shows that the estimate is already unbiased for A'p = 500. 
The estimates of the free energy at 300 K from the co- 
ordinates of this centroid are AFy^T = —16.33 ± 0.01 
kcal/mol and AFy^N = -73.78 ±0.01 kcal/mol; the val- 
ues of the free energy are lower for the larger groups (see 
Fig. ^1, essentially due to their more negative energies. 

A summary of the scatterplots of Fig. [5] and of other 
types of analysis of the sampled data is presented in Ta- 
ble The free energy perturbation method is sys- 
tematically biased for the small samples |2^ and gives in- 
correct results even when the complete data set is used. 
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0.60 
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-17.14 
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0.30 
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0.44 
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500 


-16.32 


0.11 


-73.79 


0.16 
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-16.32 


0.08 


-73.79 


0.10 
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-16.33 


0.04 


-73.78 


0.05 
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-16.33 


0.04 


-73.78 


0.04 
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0.09 


-73.78 


0.13 
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-16.32 


0.03 


-73.79 


0.05 
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4 


5000 


-16.32 


0.02 


-73.79 


0.03 




8,12 


5000 


-16.33 


0.01 


-73.78 


0.01 



"Free energy perturbation, initial; ''Free energy perturbation, final; 
■^Pree energy perturbation, average; ''Bennett's acceptance ratio; 
^Multi-state acceptance ratio. 

TABLE I: The 300 K free energy estimates in kcal/mol from 
equilibrium samples of A^'s peptides at At temperatures, using 
Ap data points for each transition. The standard deviations a 
were calculated from 10,000 random samples; those in paren- 
theses are analytical estimates from one sample Q. 

The average of the forward and backward data 24] is 
somewhat better but still considerably less accurate than 
the Bennett acceptance ratio analysis of the same data 
set. Inclusion of more pathways in the multi-state accep- 
tance ratio analysis improves the statistics and keeps the 
estimate consistent. The striking feature of this table is 
the scaling of the multi-state acceptance ratio standard 

deviation a; it is approximately proportional to N^^Np ^ 
for a wide range of temperatures. 

To estimate the efficiency of this method, we used ran- 
dom subsets of the microstates, instead of the random 
subsets of the work used for Fig.[21and Tabled We picked 
500 random structures from each thermodynamic state to 
create the arrays of the work for all pathways; these work 
data are correlated. The estimate of AFv-,n from 10,000 
repetitions of this procedure is: Ai^y^N — —73.78 ±0.10 
kcal/mol. This corresponds to a reduction of the stan- 
dard deviation by a factor of 5.1 compared to that of 
the Bennett acceptance ratio between V and N at 300 K 
shown in Tabled (entry 2"^, 1, 500); thus, the analysis of 
1 ns of total simulation obtains the same accuracy as 26 
ns analyzed with the Bennett method between the two 
end states. 

Conclusions. — We have presented the asymptotically 
optimal way to obtain the free energy differences from 



samples of the work data between multiple states. The 
resulting multi-state acceptance ratio method is numer- 
ically efficient and stable. We have demonstrated the 
applicability of this approach to the analysis of parallel 
tempering simulations and have shown that it provides 
estimates of the resulting free energy differences that are 
more precise and accurate than those those from the Ben- 
nett method between two states. We are applying the 
approach to a range of problems (e.g. relative solvation 
free energies of amino acids). The method can be used 
also to enhance multi-ligand binding simulations or the 
analysis of experimental work data [3j Hi-el's an extension 
of the approach of Hummer and Szabo |a| . 
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